WEBVTT 00:00:00.000 --> 00:00:09.000 Thanks. Hi, everyone. My name is Chris Freeland. I'm 00:00:09.000 --> 00:00:16.000 a librarian at the Internet Archive. I want to welcome you to today's Library as Laboratory session. 00:00:16.000 --> 00:00:27.000 This is our series for showing off the innovative research projects and the scholars who are using Internet Archive services or collections in their digital humanities projects. 00:00:27.000 --> 00:00:39.000 In our last session we had two of the Internet Archive's top web archiving experts, Jefferson Bailey and Helge Holzmann, give us the basics of web archiving, and they worked to answer 00:00:39.000 --> 00:00:44.000 the bold question: What can you do with billions of archived web pages? 00:00:44.000 --> 00:00:55.000 If you missed that session, the session recording and the recap are now available, and Duncan, who's working behind the scenes today along with Caitlin Pillow, will share the link to that in 00:00:55.000 --> 00:01:02.000 chat. I see that there. Thank you. So as we get started today, I want to share a few reminders. 00:01:02.000 --> 00:01:06.000 We do have automated captions available for today's discussion. You can turn those on using 00:01:06.000 --> 00:01:13.000 the live transcript feature of Zoom. We also have the ability for people to copy the things that are shared in chat. 00:01:13.000 --> 00:01:19.000 You can also view the entire transcript of everything that's said using that live transcript feature. 00:01:19.000 --> 00:01:26.000 So if there's something that you want to follow back on, the entire transcript is available for you to use. 00:01:26.000 --> 00:01:35.000 Our chat is open. Please do be respectful in your communications today. Here's the way that I'd like you to use the chat. 00:01:35.000 --> 00:01:42.000 Use it for running commentary. 
Keep it on topic, but, you know, let's have a conversation about what's happening in the chat, and then use the Q&A 00:01:42.000 --> 00:01:47.000 feature to submit questions for the panelists to answer 00:01:47.000 --> 00:01:54.000 when we have set times for Q&A. We're going to do follow-up questions after each speaker today. 00:01:54.000 --> 00:02:00.000 And then we also have a bit of time to ask questions at the end of our session, which is going to run for 90 minutes today. 00:02:00.000 --> 00:02:05.000 So, for now, it'd be great to use the chat to drop in a hello. 00:02:05.000 --> 00:02:10.000 Let us know who you are and where you're tuning in from today. 00:02:10.000 --> 00:02:15.000 I'm seeing a comment that the chat cannot be scraped and saved to a text file, 00:02:15.000 --> 00:02:20.000 nor is the option to save the chat available. It should be, if you click on the live transcript. 00:02:20.000 --> 00:02:27.000 That gives you the live transcript. 00:02:27.000 --> 00:02:32.000 I think, within the chat, we can double-check to make sure that you can copy it. 00:02:32.000 --> 00:02:43.000 We save the chat and it's available, and everybody who participates in today's session will get an email with a link to the video recording and to all the resources that we share. 00:02:43.000 --> 00:02:48.000 So no worries, everything that we're saying here will be made available after the session. 00:02:48.000 --> 00:02:52.000 So please do let us know where you're tuning in from today. 00:02:52.000 --> 00:03:03.000 So as part of our series, we want to bring you new and emerging projects and late-breaking news, and we're going to start off today's session with a quick update from Quinn Dombrowski of 00:03:03.000 --> 00:03:07.000 Stanford University, and I believe I saw this earlier on Twitter: 
00:03:07.000 --> 00:03:16.000 that Quinn is about to give the first public presentation about a new project, Saving Ukrainian Cultural Heritage Online. 00:03:16.000 --> 00:03:21.000 We will have time for one or two questions after Quinn's presentation. 00:03:21.000 --> 00:03:24.000 So again, use the Q&A to drop in questions. Quinn, please. 00:03:24.000 --> 00:03:31.000 The screen is yours. Thank you so much, Chris, and thanks for having me. 00:03:31.000 --> 00:03:36.000 The Internet Archive has been an invaluable partner on the SUCHO project. 00:03:36.000 --> 00:03:40.000 Mark Graham has been in our Slack answering questions, helping us debug things. 00:03:40.000 --> 00:03:51.000 Some days I feel bad at how much we're sending your way with the servers, but it's been an absolutely essential resource for making this happen. 00:03:51.000 --> 00:04:06.000 So what I have for you today I'm calling "Libraries, Laboratories, and Back Alleys," recognizing that many of you are coming from universities with sort of traditional libraries that may be exploring what a laboratory model might look 00:04:06.000 --> 00:04:13.000 like within the library when it comes to web archiving and other kinds of innovative tools and methods. Slide here. 00:04:13.000 --> 00:04:19.000 So I really love this analogy of libraries and laboratories. 00:04:19.000 --> 00:04:22.000 You know, both of those institutional structures have things in common. 00:04:22.000 --> 00:04:27.000 They both have processes and procedures that are there for a reason. 00:04:27.000 --> 00:04:37.000 Libraries, you know, take the long view with things. They need to be selective in what they choose to archive, and they need to think about storage, not only for today, but also for tomorrow. 00:04:37.000 --> 00:04:44.000 And there are policies and procedures in place to keep this happening and keep it safe, you know. 00:04:44.000 --> 00:04:49.000 Similarly with laboratories. 
You can't just let anyone walk in from the street into a laboratory. 00:04:49.000 --> 00:05:02.000 You know, there are occupational safety and hazard regulations and things to make sure that the research that is done is done the right way; that it supports reproducible research; that the findings are 00:05:02.000 --> 00:05:13.000 sound and solid. But then there are back alleys, and that's where I've been finding myself for the last two weeks with my SUCHO co-coordinators, Anna Kijas and 00:05:13.000 --> 00:05:27.000 Sebastian Majstorovic. The production that we've put together reminds me of some of the images that we've seen coming out of Ukraine, with regular people who, you know, 00:05:27.000 --> 00:05:41.000 have ordinary jobs. They're teachers, they're doctors, they're students. And, you know, whole communities are chipping in to provide resources, you know, to build supplies to support the war effort. 00:05:41.000 --> 00:05:50.000 It feels a little bit like a community Molotov cocktail factory in a back alley, with everyone chipping in however they can. 00:05:50.000 --> 00:05:57.000 So, for the last two weeks, the group only started March the first. In the course of two weeks, 00:05:57.000 --> 00:06:02.000 we now have over 1,200 volunteers. We've captured over 2,300 websites. 00:06:02.000 --> 00:06:16.000 We've uploaded more than 14,000 documents and images and related materials that we've gotten from those websites to the Internet Archive. And we're also working with the Webrecorder software for people to 00:06:16.000 --> 00:06:20.000 create their own high-fidelity web archives on their own computers. 00:06:20.000 --> 00:06:25.000 And not only computers. We also have enlisted Raspberry Pis into the operations. 00:06:25.000 --> 00:06:29.000 We've been working with so many different people and devices and hardware. 00:06:29.000 --> 00:06:35.000 People are running things on their own servers. 
All of us are trying to capture things as quickly as possible. 00:06:35.000 --> 00:06:39.000 And this is necessary in a state of emergency. 00:06:39.000 --> 00:06:45.000 We are not associated formally with any institution, with any sort of infrastructure. 00:06:45.000 --> 00:06:58.000 We're being supported by folks from the Internet Archive, we're being supported by the institutions that we work for, but we're able to act in the capacity of volunteers who can work quickly and outside of the kind of 00:06:58.000 --> 00:07:11.000 more formal and thoroughly thought-through university structures, because there simply isn't time to go through all of the hoops and procedures and committee meetings necessary to do this sort of thing in a university context. 00:07:11.000 --> 00:07:18.000 So I think the question I have for the community is whether library laboratories 00:07:18.000 --> 00:07:23.000 within a university and institutional context can reduce the need for Molotov cocktails 00:07:23.000 --> 00:07:28.000 moving forward. You know, it's an amazing effort with our whole volunteer community. 00:07:28.000 --> 00:07:39.000 But can we put some safeguards in place moving forward for capturing digital cultural heritage on an ongoing basis, before it's an emergency, so that we don't need to have as many Molotov 00:07:39.000 --> 00:07:47.000 cocktails to make this work? And, furthermore, how can library laboratories offer meaningful guidance and support for emergency efforts? 00:07:47.000 --> 00:08:02.000 How can universities and folks who've been working with web archives, you know, share the things that they've learned with people who are coming into this space in the spirit of banding together in an emergency to get things done, to help steer us 00:08:02.000 --> 00:08:07.000 clear of dead ends, and to, you know, point us towards 00:08:07.000 --> 00:08:13.000 tools and methods and approaches and techniques that can help our work go faster? 
00:08:13.000 --> 00:08:16.000 This is not a project that any of us want to do. 00:08:16.000 --> 00:08:19.000 This is not a project that we hope will ever be used. 00:08:19.000 --> 00:08:30.000 We would like nothing more than for none of these web archives to be necessary in the end, for all the servers to remain up and uncompromised, and for the people maintaining them to be safe. 00:08:30.000 --> 00:08:45.000 But in case that doesn't happen, we're all glad to be here and to be contributing to making sure that some form of this crucial digital cultural heritage is captured accurately moving 00:08:45.000 --> 00:08:55.000 forward. Thank you. Thank you, Quinn. So we do have a couple of questions that have been submitted, 00:08:55.000 --> 00:09:08.000 and we have time for those. So if you're open to this: what's the biggest challenge that you've faced here in these first two weeks? It's time. I mean, time from multiple angles. On 00:09:08.000 --> 00:09:14.000 one hand, you know, we've seen sites go down shortly after we finished archiving them. 00:09:14.000 --> 00:09:19.000 There are sites that we've missed. So, you know, it is a race against time. 00:09:19.000 --> 00:09:32.000 We have folks monitoring the situation on the ground, and we update things, you know, based on where we see that there is active fighting, active bombing, and where servers might be compromised. 00:09:32.000 --> 00:09:43.000 And, you know, there's also time on the other side: it takes time to onboard volunteers. 
You can't just throw tutorials at people and be like, "Have fun." Sometimes that works, 00:09:43.000 --> 00:09:57.000 but often it doesn't. And so Dina Strong and other volunteers in our community have been putting an incredible amount of effort into answering questions, pointing people to the right place to start, and running live sessions. Kim Martin has 00:09:57.000 --> 00:10:06.000 been amazing, organizing a very well organized, as you would imagine, metadata team to enhance the materials that have been uploaded to the Internet Archive. 00:10:06.000 --> 00:10:11.000 So it's the pressures of time on both ends. 00:10:11.000 --> 00:10:16.000 Great. Another question here: are you archiving social network content, like Telegram or others? 00:10:16.000 --> 00:10:23.000 There are other initiatives that are working on web archiving more broadly, I believe, including social media. 00:10:23.000 --> 00:10:26.000 We are very specifically focusing on cultural heritage institutions. 00:10:26.000 --> 00:10:40.000 So libraries, archives, museums, you know, ballets, things that represent Ukrainian culture and cultural heritage that people have invested the time and effort to put online and share with the 00:10:40.000 --> 00:10:42.000 world. We have time for one more question here, and on this one 00:10:42.000 --> 00:10:46.000 you kind of touched on it a little bit. 00:10:46.000 --> 00:10:51.000 But how has this new priority project affected your regular day-to-day work? 00:10:51.000 --> 00:11:05.000 I no longer have regular day-to-day work, and I am incredibly grateful to Stanford Libraries for making it possible, and to the Division of Literatures, Cultures, and Languages, where I work. 00:11:05.000 --> 00:11:10.000 People understand right now that this is an absolute emergency. We need all hands on deck. 00:11:10.000 --> 00:11:15.000 You know, Stanford Libraries has supported various of my colleagues working on this as well. 
00:11:15.000 --> 00:11:22.000 I look forward to getting back into computational text analysis of literature in many languages at some point in the future. 00:11:22.000 --> 00:11:25.000 But right now it's honestly day to day. 00:11:25.000 --> 00:11:29.000 Yeah. Well, thank you, Quinn, for doing the work. 00:11:29.000 --> 00:11:31.000 Thanks for coming here today and sharing a little bit about this project. 00:11:31.000 --> 00:11:37.000 I see that Duncan has dropped a link back to the... How do you pronounce it? 00:11:37.000 --> 00:11:40.000 Are you pronouncing it "SUCHO"? ...website. Great. 00:11:40.000 --> 00:11:44.000 Thank you. And again, Quinn, thanks so much for your time and attention here today. 00:11:44.000 --> 00:11:47.000 We all really appreciate it, and the work that you're doing. 00:11:47.000 --> 00:11:51.000 Yeah, thanks. Thanks to the Internet Archive for the support for the project. 00:11:51.000 --> 00:11:58.000 Glad to do it. Thanks so much. And now we're going to hear from five innovative web archiving projects from Archives Unleashed. 00:11:58.000 --> 00:12:07.000 So the remainder of our talk today will be with the Archives Unleashed team and the cohort program, with the five research projects that we're going to feature today. 00:12:07.000 --> 00:12:14.000 So Sam Fritz is here today from Archives Unleashed to talk about this inaugural cohort program. 00:12:14.000 --> 00:12:23.000 Then we'll have a question or two after each presentation, and then we'll have open Q&A 00:12:23.000 --> 00:12:28.000 at the end with our remaining time. Again, 00:12:28.000 --> 00:12:35.000 we do run for 90 minutes today, and knowing that this is the Zoom world that we live in, no worries if you need to move on to another meeting. Again, 00:12:35.000 --> 00:12:38.000 everyone who's registered will get a link to the session recording tomorrow. 
00:12:38.000 --> 00:12:45.000 So if you need to leave at the top of the hour, or even before, to make it to your next meeting, you're not going to miss a thing. 00:12:45.000 --> 00:12:51.000 So with that, over to you, Sam. Thanks so much, Chris, for the introduction. 00:12:51.000 --> 00:13:04.000 I'm just going to share my screen here, and a special thanks to everybody who's been hard at work to make this series happen. 00:13:04.000 --> 00:13:21.000 It's a pleasure to be here today. Sorry for the technical difficulties there. 00:13:21.000 --> 00:13:32.000 All right. No worries. It all looks great. Perfect, thanks. So thanks so much to everybody in the audience who's joining us today. 00:13:32.000 --> 00:13:45.000 We're really excited to take part in this series, especially, you know, in highlighting the ways that researchers are engaging with services and resources offered through the Internet Archive. 00:13:45.000 --> 00:13:54.000 And so for Archives Unleashed, this means celebrating the research collaborations and the data-intensive projects that have been developed through a cohort program. 00:13:54.000 --> 00:14:10.000 So, as Chris mentioned, I'm Sam Fritz, project manager for Archives Unleashed, and I make up one-eighth of our very interdisciplinary team. We have investigators who bring a wide expertise from 00:14:10.000 --> 00:14:23.000 libraries and archives, digital humanities, computer science and web engineering, and everything in between. And as Chris mentioned at the top, you know, the series approaches a few different questions. For our session today, 00:14:23.000 --> 00:14:33.000 we're going to focus on what you can do with web archives, or, you know, what are some of the applications or case studies that use web archival data for research. Hey, 00:14:33.000 --> 00:14:36.000 Sam, I'm sorry to interrupt, but we're getting a couple of messages. 
00:14:36.000 --> 00:14:43.000 It looks like you have a video bar really close to where your slides are. 00:14:43.000 --> 00:14:47.000 Yeah, if you wouldn't mind moving that, then we can see all of the slides. 00:14:47.000 --> 00:14:51.000 Okay, sure. And Zoom does make it difficult to rearrange that bar. 00:14:51.000 --> 00:14:59.000 Yeah, it can be a challenge. Oh, goodness. Okay, is this better? 00:14:59.000 --> 00:15:03.000 It keeps popping up. Yeah, you can minimize it, 00:15:03.000 --> 00:15:06.000 if it's okay with you. There's a little minus button 00:15:06.000 --> 00:15:14.000 next to the... Yeah. Yeah, that'll be great. Okay, perfect. Sorry about that. 00:15:14.000 --> 00:15:23.000 Great. So to help ground us in this question of, you know, what can we do with web archives, 00:15:23.000 --> 00:15:28.000 I'm going to provide some context around the Archives Unleashed project and the cohort program. 00:15:28.000 --> 00:15:42.000 So, for those that are new to our project, Archives Unleashed was formally established in 2017 with funding from the Andrew W. Mellon Foundation, and the project recognizes and appreciates the critical role that web archives 00:15:42.000 --> 00:15:47.000 play for scholars who are studying a variety of topics from the 1990s 00:15:47.000 --> 00:15:53.000 onwards. And so the guiding principle for our team has always been, you know, 00:15:53.000 --> 00:15:57.000 how do we lower the barriers of entry to access and use of web archives? 00:15:57.000 --> 00:16:12.000 So during the first phase of our project, the team focused in on three main activities, the first being to develop open-source and user-friendly tools that allow folks to conduct scalable analysis. 00:16:12.000 --> 00:16:18.000 And so this has involved developing the Archives Unleashed Toolkit and Cloud, alongside 00:16:18.000 --> 00:16:22.000 exploratory methods, such as using Jupyter notebooks. 
00:16:22.000 --> 00:16:33.000 Our second priority was to create resources like user documentation, learning guides, and tutorials to help address some of those technical components of our tools. 00:16:33.000 --> 00:16:39.000 But they were also created to inspire confidence and encourage exploration of web archives. 00:16:39.000 --> 00:16:56.000 And then, finally, we paid attention to building and engaging a community of users, and this was around hosting a series of datathons, not only for skill development, but also to foster a sense of belonging and support through things like collaboration and 00:16:56.000 --> 00:17:12.000 partnerships. So our work continues into a second phase, and in 2020 this led to a partnership with the Internet Archive and our Archive-It collaborators, and here we have two priorities. 00:17:12.000 --> 00:17:22.000 So the first is to develop an analysis platform, which essentially reimagines the Archives Unleashed tools and sees them integrated within the Archive-It environment. 00:17:22.000 --> 00:17:28.000 And our goal here is to provide this end-to-end process for collecting and studying web archives. 00:17:28.000 --> 00:17:35.000 And here we have ARCH. The second priority draws upon and expands 00:17:35.000 --> 00:17:42.000 our foundations of community building. And so our aim here is to focus on opportunities that engage with web archives 00:17:42.000 --> 00:17:47.000 research, but in a more intensive way than our datathons allowed for. 00:17:47.000 --> 00:18:03.000 And so these two priorities are very much intertwined, and I would just like to take a brief moment to describe ARCH as a way of grounding the work and the research that you're going to hear about from our 00:18:03.000 --> 00:18:20.000 cohort panelists today, and as I talk I'm going to play this very short video, which kind of gives you a bit of a guided tour through our interface. 
So ARCH, simply put, is a platform for interacting with and 00:18:20.000 --> 00:18:33.000 conducting analysis on web archive collections, and as part of our integration work, ARCH presents a similar look and feel to other Archive-It research services that partners would be very familiar with. 00:18:33.000 --> 00:18:43.000 The platform provides an entry point for conducting in-depth analysis, and we see it as a way of unlocking the research potential of these web archive collections. 00:18:43.000 --> 00:18:46.000 And so here you'll see a little bit of a tour. 00:18:46.000 --> 00:19:01.000 It provides the ability to generate over a dozen data sets that researchers can then take and do more in-depth analysis on. And it also presents some simple in-browser visualizations for researchers just to 00:19:01.000 --> 00:19:09.000 get that initial deep dive into it and start to see what's in the web archive collection, and some of the questions they can start to ask of it. 00:19:09.000 --> 00:19:18.000 And so in terms of the data sets that are offered, we've categorized them into these four working themes. 00:19:18.000 --> 00:19:29.000 The collection data sets provide a basic overview of a collection, and I think a lot of our cohort groups have used that as an entry point. Network 00:19:29.000 --> 00:19:34.000 data sets can be used to explore the ways that websites link to each other. 00:19:34.000 --> 00:19:38.000 Then users can also explore text components of a web archive. 00:19:38.000 --> 00:19:45.000 So this would include plain text of a web page, text file information, as well as named entities. 00:19:45.000 --> 00:19:54.000 And then, finally, the file formats theme provides an opportunity to explore information on the binary files within a web archive. 00:19:54.000 --> 00:20:06.000 And so now we turn to the cohort program. And so, in understanding the logistics that are involved, we again return to this question of what you can do with web archives. 
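NOTE A minimal sketch of the kind of question a network data set can answer (how websites link to each other), assuming the data arrives as a CSV of source domain, target domain, and link count. The column names and sample rows below are hypothetical illustrations, not the actual ARCH export schema.

```python
import csv
import io
from collections import Counter

# Hypothetical sample standing in for a domain-graph style derivative.
# The column names (crawl_date, source, target, count) are assumptions.
SAMPLE_CSV = """crawl_date,source,target,count
20200401,example-city.ca,example-health.ca,12
20200401,example-winery.ca,example-health.ca,3
20200401,example-city.ca,example-winery.ca,1
20200501,example-health.ca,example-city.ca,4
"""


def inlink_totals(csv_text):
    """Sum the link counts pointing at each target domain."""
    totals = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["target"]] += int(row["count"])
    return totals


totals = inlink_totals(SAMPLE_CSV)

# List domains from most linked-to to least linked-to.
for domain, n in totals.most_common():
    print(domain, n)
```

Ranking domains by inbound links is only one possible starting point; the same rows could also be loaded into a graph library for fuller network analysis.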
00:20:06.000 --> 00:20:14.000 So our goal in launching this program is to foster and support research and engagement with web archives as scholarly objects. 00:20:14.000 --> 00:20:26.000 And so our cohort research teams have engaged in this year-long, intensive collaboration, or almost a year as of now, to conduct investigations that use web archival data. 00:20:26.000 --> 00:20:34.000 We've built supports into the program that include things like mentorship from the Archives Unleashed team. 00:20:34.000 --> 00:20:39.000 So this comes from leadership from Ian Milligan and Nick Ruest. 00:20:39.000 --> 00:20:50.000 We've also connected teams with individuals who have specialized knowledge and expertise that have been very helpful in addressing some of the roadblocks that we've encountered. 00:20:50.000 --> 00:21:02.000 So this includes contributions from Jimmy Lin and one of our lead developers on ARCH, Helge Holzmann, and we've also provided opportunities for peer-to-peer support for groups to be able 00:21:02.000 --> 00:21:09.000 to share some of their best practices and their resources and experiences as they've gone through the program. 00:21:09.000 --> 00:21:25.000 To tie us back into ARCH, our cohort groups are essentially the first users to pilot ARCH, and it's been a very rewarding experience, both for the researchers, because they are able to, you know, fairly easily and quickly 00:21:25.000 --> 00:21:40.000 generate these data sets for analysis, but also for our team, as we've been able to iteratively develop a platform that continually assesses the needs of users. And I would also just like to finally point out here that an 00:21:40.000 --> 00:21:45.000 additional benefit of this program has been to help increase the use, 00:21:45.000 --> 00:21:51.000 the access, and the visibility of the curated collections by archivist partners. 
00:21:51.000 --> 00:22:02.000 Oh, sorry, and I will just really quickly point out here, if you are interested in starting 00:22:02.000 --> 00:22:09.000 a research project, we are currently accepting applications for our second round of cohorts. 00:22:09.000 --> 00:22:22.000 So if that's of interest, I would point you towards our program web page, and of course please feel free to connect with our team with any questions that you might have. 00:22:22.000 --> 00:22:27.000 And so finally, I'd like to introduce you to our cohort researchers that we're going to be talking with today. 00:22:27.000 --> 00:22:38.000 So these research teams span across North America and Europe, and they've selected a wide range of topics to study. Speaking on behalf of our project team, 00:22:38.000 --> 00:22:50.000 it has been incredibly inspiring to see these teams use very creative approaches and methods, to see the way that teams have transformed challenges into learning opportunities, 00:22:50.000 --> 00:22:55.000 and ultimately, you know, to witness these projects take shape. 00:22:55.000 --> 00:23:06.000 So I truly hope that by the end of this session and this discussion everybody here will see the truly innovative applications of web archives research, 00:23:06.000 --> 00:23:14.000 you know, not only within the digital humanities field, but in adjacent and overlapping fields as well. 00:23:14.000 --> 00:23:23.000 So with that I'm pleased to introduce the researchers from our cohort program that join this panel today to speak on their research journeys. 00:23:23.000 --> 00:23:33.000 We have Tim Ribaric from Brock University; Shana MacDonald from the University of Waterloo; Robert Jansma from the University of Siegen; Valérie Schafer from the University of 00:23:33.000 --> 00:23:38.000 Luxembourg; and Shawn Walker from Arizona State University. 
00:23:38.000 --> 00:23:48.000 So I think, before we jump into the projects themselves, we are taking a moment for questions. Yeah. 00:23:48.000 --> 00:23:55.000 So, Sam, thanks. One quick question for you before we go into the cohorts. 00:23:55.000 --> 00:24:00.000 How did you get interested in web archiving? So, honestly, 00:24:00.000 --> 00:24:16.000 before I jumped on this project, I hadn't worked with web archives, but I had had experience in learning about the projects that both Ian and Nick were involved in, and it just seemed like such an amazing and supportive community that was very 00:24:16.000 --> 00:24:22.000 well situated in a number of different communities, right, like with libraries, with archives, with researchers. 00:24:22.000 --> 00:24:41.000 So I think that, for me, the most interesting point was just connecting all of these different communities based on the activity of web archiving. That's great. Seeing no other questions, I'll turn it back to you, and we can move on with the next 00:24:41.000 --> 00:24:55.000 speaker. Perfect. So I'm pleased to introduce the first speaker, Tim Ribaric, who comes to us from Brock University, and he's going to be talking about crisis communication in the Niagara region during 00:24:55.000 --> 00:25:00.000 the COVID-19 pandemic. So over to you, Tim. 00:25:00.000 --> 00:25:05.000 Thanks very much, Sam. So just one moment, I'll share my slides here. 00:25:05.000 --> 00:25:26.000 Okay. So, as mentioned, my name is Tim Ribaric. I'm from Brock University in the Niagara region, and in our project we were investigating and focusing on crisis communication, and how it was conducted in the 00:25:26.000 --> 00:25:38.000 Niagara region during the course of the pandemic. 
So I'm going to first describe the data set and how we prepared it, and then get into our research question and the few different venues that we've been exploring this 00:25:38.000 --> 00:25:45.000 idea with. The project website will be shared in the chat box in due course. 00:25:45.000 --> 00:25:48.000 So feel free to check all of our blog posts about successes 00:25:48.000 --> 00:26:10.000 so far. Okay, so, aha, there we go. So, starting in early pandemic days, the archives at Brock University created a seed list and some scrapes of certain data sets, or certain websites, pardon me, having to do with 00:26:10.000 --> 00:26:16.000 institutions in the Niagara region that dealt with COVID and the pandemic. 00:26:16.000 --> 00:26:21.000 So you can think of municipal governments, community organizations. 00:26:21.000 --> 00:26:38.000 And another thing we found particularly interesting was different wineries and other locations in the Niagara region that were forced to sort of change production from, you know, spirits to, you know, alcohol for cleansing and stuff like that. So all of this 00:26:38.000 --> 00:26:50.000 chronology, and all these incidences, or all of these things, pardon me, are kept track of in this archive. So approximately 56 different organizations, and actually I think it's closer to 400 00:26:50.000 --> 00:26:58.000 gigabytes of data. So it's not huge, but it's certainly not something you want to sort through on your own directly. 00:26:58.000 --> 00:27:13.000 So, much like Sam has described, we're using ARCH to extract the full text, which is what we were the most interested in, of all of these websites, and then prepare it in a way that we can perform some analysis on 00:27:13.000 --> 00:27:19.000 it. So the direction we went in is sort of twofold. 
00:27:19.000 --> 00:27:34.000 We created a Solr Blacklight search interface that sits on top of the whole archive, so that the researcher who is performing the research can do some distant reading and look for trends, using the whole contents of 00:27:34.000 --> 00:27:49.000 the archive. But then the other thing we've been spending a lot of time on is analyzing and creating further derivatives of the full text that we got from ARCH, in different environments such as Google Colab. So once we were 00:27:49.000 --> 00:27:54.000 able to sort of take a look at this large archive and pull some information out of it, 00:27:54.000 --> 00:28:08.000 we could then polish it up and put it into a Jupyter notebook environment to hand over to the researcher who is performing the investigation, and not have them worry about the technical overhead or the idea of working with large bits 00:28:08.000 --> 00:28:12.000 of data. We have notebooks that do all the important parts. 00:28:12.000 --> 00:28:16.000 So, all told, this is how it all sort of breaks down. 00:28:16.000 --> 00:28:29.000 We have our archive; we took a copy of that into Solr Blacklight, and we use that as a search interface to find trends and other bits of evidence. And then, on the other side of that, or 00:28:29.000 --> 00:28:45.000 conversely, we have the ARCH full-text derivatives that we've been manipulating, changing, and then putting into a Google Colab environment, where we perform more computationally inspired text analysis and other forms of analysis. So that's 00:28:45.000 --> 00:28:58.000 the stage set with the data set. So how about the research question? So what we're going to do, or what we're in the process of, I should say, is investigating this idea of crisis communication. 00:28:58.000 --> 00:29:07.000 And anytime a natural disaster happens, or there's an outbreak, or there's, you know, adverse activity in the environment or in the community, 
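NOTE A minimal sketch of the notebook side of the pipeline described above: taking full-text records and counting terms per organization as a first step toward text analysis. The record fields (`domain`, `content`), the sample rows, and the stopword list are hypothetical stand-ins; the project's actual notebooks and derivative schema are not shown in the session.

```python
import re
from collections import Counter, defaultdict

# Hypothetical full-text records; real derivatives would be read from a file.
RECORDS = [
    {"domain": "example-city.ca",
     "content": "Masks are required on transit. Wear masks indoors."},
    {"domain": "example-city.ca",
     "content": "Vaccine clinics open Monday."},
    {"domain": "example-winery.ca",
     "content": "We now produce sanitizer instead of spirits."},
]

# A deliberately tiny stopword list, for illustration only.
STOPWORDS = {"a", "and", "are", "in", "instead", "now", "of", "on", "the", "we"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def terms_by_domain(records):
    """Build one term-frequency counter per domain."""
    counts = defaultdict(Counter)
    for rec in records:
        counts[rec["domain"]].update(tokenize(rec["content"]))
    return counts


counts = terms_by_domain(RECORDS)
for domain, counter in counts.items():
    print(domain, counter.most_common(3))
```

Keeping the tokenizing and counting in small functions means a researcher who does not code can rerun the cell on a new subset without touching the logic.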
00:29:07.000 --> 00:29:18.000 The study of crisis communication is assessing how well the communications surrounding those crises are broadcast to different 00:29:18.000 --> 00:29:28.000 constituents. So basically, all told, that comes down to these 4 different facets: resilience, education, trust, and engagement. 00:29:28.000 --> 00:29:42.000 So what we're trying to do, then, is take a look at the information that was produced by these different organizations during the course of the pandemic, to see how well they rank, more or less, along these 4 different facets. 00:29:42.000 --> 00:29:53.000 Then, put more specifically, you know, can we determine how well the organizations in the Niagara region performed in these 4 dimensions during the course of the pandemic? 00:29:53.000 --> 00:30:01.000 So how do we actualize this, or what's one thread that we're investigating? So where we're situated 00:30:01.000 --> 00:30:08.000 is in the southeastern part of Ontario, and we're sort of physically, geographically bound. 00:30:08.000 --> 00:30:20.000 We're a peninsula. So we have this really nice rigid structure where everyone kind of knows where the Niagara region begins, and we also have this added complexity of another municipal body called the Niagara 00:30:20.000 --> 00:30:25.000 Regional Council, which is an elected body of representatives from all the different municipalities. 00:30:25.000 --> 00:30:38.000 And so there's this push-pull between the municipalities, the cities and towns locally, and then the Regional Council, and one of the things we're trying to investigate is how messages regarding 00:30:38.000 --> 00:30:46.000 COVID were broadcast in the Niagara region. Did it go to the Niagara Regional Council first, and then down to the municipalities? 00:30:46.000 --> 00:30:53.000 Was there any back and forth? The Regional Council is officially the public health board of the region.
00:30:53.000 --> 00:31:09.000 But that doesn't mean, you know, all municipalities were in lockstep with what the Regional Council was trying to broadcast. So we're trying to use some computational methods to see if the terminology 00:31:09.000 --> 00:31:22.000 and text that the municipality, or the Regional Council, used is being adopted by the municipalities in a similar way, or if it was riffed on or changed or sort 00:31:22.000 --> 00:31:34.000 of made a bit different. It's been proving to be a pretty interesting line of research so far. So that's the research question and the research that we're trying to do. We're also incorporating or investigating 00:31:34.000 --> 00:31:45.000 this data set's use as a teaching opportunity, or something to be used in class along with a Jupyter notebook to sit on top of it. 00:31:45.000 --> 00:31:55.000 So we wanted to see how well notebooks can be used for teaching, and more specifically, are they intuitive enough to use in a classroom environment with learners that don't know how to code? 00:31:55.000 --> 00:32:09.000 So this brought us to an activity we did in the fall, where we had a fourth-year communication class, communication for professionals, and we had a subset of this data that we were showing them in various different ways, produced in a 00:32:09.000 --> 00:32:18.000 notebook, having them sort of look at the quantitative and qualitative markers of this data to have a discussion on, 00:32:18.000 --> 00:32:32.000 you know, how well it did what it's supposed to do. And we also pulled all these different threads together into a workshop 00:32:32.000 --> 00:32:40.000 that will be part of the regular workshop series that the Digital Scholarship Lab at Brock University offers, and it was simply called Analyzing Web Archives.
00:32:40.000 --> 00:32:47.000 We took all these disparate pieces and put them together in a 90-minute session that we thought would be useful for attendees 00:32:47.000 --> 00:32:51.000 of various different backgrounds, and, you know, scattered throughout the world. 00:32:51.000 --> 00:32:55.000 So we'll just say we're at the stage now where we're just putting our manuscript together. 00:32:55.000 --> 00:33:08.000 We're also looking at how we can bolster and take a look at using notebooks for teaching and research. What's really great, and I'll just add one quick last anecdote: the researcher who's primarily doing 00:33:08.000 --> 00:33:12.000 the manuscript preparation is not what you would call a computationally inspired researcher. 00:33:12.000 --> 00:33:24.000 He enjoys distant reading, but we were able to put together a data set and some tools that will allow him to sort of work in a way that he's comfortable with, all the while avoiding any sort of unnecessary 00:33:24.000 --> 00:33:28.000 jargon and very deep technical knowledge of how to manipulate text. 00:33:28.000 --> 00:33:38.000 And so far it's going pretty well. And as I mentioned, everything's tracked on a website, and there are links to our notebooks and all sorts of things there. So I'll stop there and entertain any questions that might 00:33:38.000 --> 00:33:49.000 come up. Thanks so much for that, Tim. We do have a question here, which is: is your research, or do you think your research is, applicable to other municipalities? 00:33:49.000 --> 00:33:52.000 Or is there something specific to the Niagara Regional 00:33:52.000 --> 00:34:07.000 Council? Oh, great question. Yeah, the dynamic we're exploring is certainly very much this Niagara Regional Council versus the municipalities, and I think in, you know, in Ontario, 00:34:07.000 --> 00:34:14.000 where we live, there's only maybe one other example of a regional council and how it compares against municipalities.
00:34:14.000 --> 00:34:19.000 So it might be too specific to be generalizable with that particular line. 00:34:19.000 --> 00:34:25.000 However, it's interesting to see jurisdictional developments. So I think that part of the lesson can be expanded in either direction. 00:34:25.000 --> 00:34:33.000 So yeah, I would say that. So great. Another question: are you finding Colab a robust platform for sharing your work? 00:34:33.000 --> 00:34:42.000 Absolutely. The only particular necessity of Colab is needing a Google account, which is ubiquitous these days. 00:34:42.000 --> 00:34:45.000 And if you structure your notebooks correctly, you can pull in data 00:34:45.000 --> 00:34:52.000 sets and have all the analysis get performed, and then the end researcher is abstracted from all the details. 00:34:52.000 --> 00:35:00.000 There's lots of discussion and examples I'm happy to share on our website to, you know, bolster that fact, for sure. That's great. 00:35:00.000 --> 00:35:05.000 Thanks so much, Tim. In the interest of time, if folks in the audience have additional questions, 00:35:05.000 --> 00:35:16.000 we'll have an opportunity at the end for additional questions for Tim, but I would like to ask Sam to come back and introduce our next speaker. 00:35:16.000 --> 00:35:29.000 All right. So up next we're gonna hear from Shana MacDonald from the University of Waterloo, and she's going to talk to us about her project entitled Everything Old Is New Again: 00:35:29.000 --> 00:35:35.000 A Comparative Analysis of Feminist Media Tactics between the Second and Fourth Waves. 00:35:35.000 --> 00:35:41.000 So, Shana, the floor is yours. Thanks. Am I unmuted? 00:35:41.000 --> 00:35:46.000 Yes, okay. I'm gonna share my screen here. 00:35:46.000 --> 00:35:52.000 Okay, and you can see the slides? Are you seeing just slides, 00:35:52.000 --> 00:35:57.000 or are you seeing my whole desktop? Just as well, that's great. Amazing. 00:35:57.000 --> 00:36:02.000 Okay, great.
So thanks so much for having me here. I'm really excited to be able to talk about the work. 00:36:02.000 --> 00:36:07.000 And so yeah, today I'm going to speak about doing feminist research on the web. 00:36:07.000 --> 00:36:19.000 And I think we're a really good case study for those who are curious about web archives but might not hold the most technological knowledge or coding skills. Largely, my research team is humanities scholars just entering 00:36:19.000 --> 00:36:25.000 the realm of web archiving. I would say that we're accidental archivists, in the sense that we've been documenting digital culture for probably a decade, 00:36:25.000 --> 00:36:29.000 but kind of in our own ways that are very much lo-fi. 00:36:29.000 --> 00:36:40.000 And so we had quite the learning curve on this project, and we're immensely grateful for the mentorship that we've been given. And so I highly recommend, for people who may be just entering this world, that this is an excellent program to be a part 00:36:40.000 --> 00:36:44.000 of. So I'm gonna show you a little bit about what we did, and it'll seem a bit entry-level, 00:36:44.000 --> 00:36:47.000 but I think the conceptual questions that came out of the research are really important. 00:36:47.000 --> 00:36:59.000 So we're the Feminist Think Tank. I co-lead this, and we're at the University of Waterloo. And we explore digital media activist practices online and kind of think those through in a way that, 00:36:59.000 --> 00:37:07.000 we want to map them with longstanding media tactics from previous eras and generations, and wanted to make that dialogue happen. 00:37:07.000 --> 00:37:15.000 Our goal is to always have accessible conversations about feminism and the variety of media tactics that have been used over time.
00:37:15.000 --> 00:37:26.000 And these are kind of our 4 goals within the project, and one of those is to develop accessible digital research methods based on the principles of data feminism, where we're always thinking through the types of power structures that are 00:37:26.000 --> 00:37:37.000 imbuing all technologies and design. And that's where this project that I'm talking about today comes in. So our work with ARCH this year 00:37:37.000 --> 00:37:41.000 had a couple of main objectives, but I'm going to speak to the second one today. 00:37:41.000 --> 00:37:47.000 We wanted, in the project, to map and analyze the presence of feminist key concepts in the archives that we were looking at. 00:37:47.000 --> 00:37:55.000 And the reason why we wanted to do this was because we realized that there were a lot of really important key terms that had been used and spoken about since, 00:37:55.000 --> 00:38:00.000 say, the 1960s and 70s, but they meant something very different in a kind of fourth-wave 00:38:00.000 --> 00:38:03.000 digital media context. And so we wanted to think about those histories a little bit, 00:38:03.000 --> 00:38:08.000 and also consider what has been overlooked or erased within our archival spaces 00:38:08.000 --> 00:38:12.000 around these key terms, and who is using them, and who gets to use them, 00:38:12.000 --> 00:38:19.000 and why. So ultimately we decided on a particular list of archives, and this became our final list right here 00:38:19.000 --> 00:38:25.000 that I'm showing you. These are the ones that we will be looking at in this project, and I'm only going to speak about one today, 00:38:25.000 --> 00:38:34.000 but these are the entirety of them. So to look through these, and to determine this particular list, we started by just simply going on Archive-It and putting in the word feminism and seeing what came up.
00:38:34.000 --> 00:38:46.000 And then we got to these and began to look through them on the ARCH platform, to look at things like domain captures, to see what the top 10 websites captured by each collection were, and that's how we determined this 00:38:46.000 --> 00:38:53.000 particular set, because we were like, well, this is an interesting kind of set of combinations of websites that would be really good to explore further. 00:38:53.000 --> 00:39:04.000 So from there we moved to text files, which was an easy file for us as entry-level technology people to work with, and we started our keyword search in the text files. 00:39:04.000 --> 00:39:08.000 And this seemed really straightforward, but it actually took us a long time to determine this. 00:39:08.000 --> 00:39:15.000 And this is where we learned to do a bit of coding, which was really, really empowering and important for us and opened up a lot of new research questions. 00:39:15.000 --> 00:39:23.000 Okay, let me... Oh, yeah, there we go. So these were the original key terms that we had imagined would be useful in thinking about feminist histories. 00:39:23.000 --> 00:39:36.000 And so this is what we started with, but once we got to know our collections better, we began to expand the set of terms a little bit further, and this is the set that we ended up with, and I'm showing you 00:39:36.000 --> 00:39:39.000 with the asterisk, because this is how we ended up coding 00:39:39.000 --> 00:39:47.000 it, with these sort of open-ended spaces. And so what we did here is we took these key terms, and then in each web archive 00:39:47.000 --> 00:39:54.000 we did word counts, so very simple code of word counts in the text file, and we wanted to see how many times these words appeared in each collection. 00:39:54.000 --> 00:40:01.000 But then we started to do this kind of two-word intersecting search, where we'd be like, where do the words labor and resistance come together?
00:40:01.000 --> 00:40:05.000 How many times are both of those words found in each collection? 00:40:05.000 --> 00:40:10.000 And so we wanted to see if there were any insights that could be offered from that. 00:40:10.000 --> 00:40:20.000 And so today I'm going to just briefly talk a little bit about the one collection, which is the Me Too collection from the Schlesinger Library at Harvard, and this is a really great one, because it's showing us a 00:40:20.000 --> 00:40:27.000 contemporary collection that was developed in real time, and it's a pretty large collection, so it's really useful. In the next 2 slides, 00:40:27.000 --> 00:40:34.000 what you'll see, which we find really interesting from our research perspective, is these are the key terms out of those top 20 terms you saw. 00:40:34.000 --> 00:40:38.000 These are the top 12, and in that top 12 the top three 00:40:38.000 --> 00:40:42.000 were media, culture, and community. So we were pleased with the fact that the word community was in the top 3, 00:40:42.000 --> 00:40:46.000 but we were very surprised that media was one and culture was two. 00:40:46.000 --> 00:40:51.000 If you look at this one, where we did a graph of the ones that had more than 5,000 hits within the collection, 00:40:51.000 --> 00:40:56.000 the pool shrinks to 7 terms, with media as the most prevalent, and culture and community behind it. 00:40:56.000 --> 00:41:02.000 But you see community is almost reduced by half in terms of prevalence. So what's notable, 00:41:02.000 --> 00:41:05.000 obviously, from our perspective, is the fact that feminism is nowhere in the top 7 list. 00:41:05.000 --> 00:41:17.000 This suggests to us, we suspect, that the collection has captured conversations about hashtag Me Too which comment more on the hashtag and its virality than on what it's actually about, and that's a really interesting 00:41:17.000 --> 00:41:25.000 insight. So to determine
if this was true, we ran a new set of text analyses on the terms within this collection, and we added terms that we thought would be more appropriate to the movement itself. 00:41:25.000 --> 00:41:30.000 So if you see here, we chose 7 specific words and did a word count on those. 00:41:30.000 --> 00:41:42.000 And if you look at these side by side, you can see that even here none of the 7 we selected to be maybe more appropriate to the collection actually outpace the frequency of our top 3 terms. 00:41:42.000 --> 00:41:46.000 And so we think that this is a really, really interesting insight. 00:41:46.000 --> 00:42:00.000 And this is seen even more clearly here when you combine those top terms from the original list with the more specific list. And so what we think that this does prove is that, you know, this collection, and the fact that we think it probably 00:42:00.000 --> 00:42:11.000 represents a pretty good snapshot of the Internet, or how the Internet archived the Me Too movement, is that it does indeed explore the digital hashtag as a phenomenon, as a medium, more than it 00:42:11.000 --> 00:42:17.000 does the gender-based violence that the hashtag refers to, and the same point comes across in this, 00:42:17.000 --> 00:42:27.000 in this word cloud. And so what I'm showing you here is as someone who is learning data visualization on the fly, and I'm doing these kind of small-scale beginner interventions, 00:42:27.000 --> 00:42:30.000 but even in those we can pull a lot of kind of analytic data. 00:42:30.000 --> 00:42:42.000 So the next steps for us are to work with this really wonderful juxtaposition of images in the collection that was created for us by digital archivist and librarian Nick Ruest from York University, who we are 00:42:42.000 --> 00:42:52.000 immensely thankful to for doing this for us. And what this shows you... I don't know if I can actually show you right now.
But if you were to click on this, you would see that these are all images that are pulled from that collection, and 00:42:52.000 --> 00:42:57.000 you can zoom in on all of them and kind of do analysis of what each of these pictures is. 00:42:57.000 --> 00:43:09.000 And we were really actually surprised when we looked at this to discover that, like the kind of keywords, there's a lot happening in these images that's almost like a sanitized version of Me Too. It's a lot of 00:43:09.000 --> 00:43:17.000 workplace images and things like that, and not the kind of activism or feminism that we were assuming would be the visualization of it online. 00:43:17.000 --> 00:43:22.000 And so we're going to spend some more time asking the question, what can images tell us about this collection? 00:43:22.000 --> 00:43:26.000 And also do a little bit more work on those intersecting terms, which we're gonna have to think through 00:43:26.000 --> 00:43:31.000 how to do a bit, in terms of our own limited capacities with technology. 00:43:31.000 --> 00:43:41.000 And then, finally, we'll apply this method we've used with this collection to all the other 7 collections, and do comparative analyses across those. So I think I'll stop there and open 00:43:41.000 --> 00:43:54.000 up to questions. Thanks so much, Shana. A question: you mentioned a couple of times that you were doing some, like, entry-level coding and new things. 00:43:54.000 --> 00:44:01.000 So how did you and your project team bring your skill level up to do the work here? Okay. 00:44:01.000 --> 00:44:03.000 So first of all, we avoided it for like 5 months. 00:44:03.000 --> 00:44:07.000 We just avoided trying. We were like, there's got to be another way, and no, there's not. 00:44:07.000 --> 00:44:14.000 So finally, when we got brave enough, we asked the mentor team at ARCH to actually just take us through some stuff.
00:44:14.000 --> 00:44:24.000 And so I'm gonna again highlight Nick Ruest on this one for actually walking me through how to code and taking that kind of fear factor out for me, which was immensely useful. 00:44:24.000 --> 00:44:33.000 So that's the kind of mentoring that does get, you know, offered, and I think that's key. And then from there I'm working through a couple of, like, easily available things on the web to teach me a little bit more about 00:44:33.000 --> 00:44:38.000 it. That's fantastic. So that mentorship program really does pay off. 00:44:38.000 --> 00:44:43.000 You have resources that you can draw from to help learn and improve. Fantastic. 00:44:43.000 --> 00:44:49.000 Well, we'll save additional questions for the session at the end. 00:44:49.000 --> 00:44:53.000 And so thank you very much, and I'd like to pass back to Sam to introduce our next speaker. 00:44:53.000 --> 00:45:00.000 Perfect, thanks so much, Shana. So for our next cohort presentation, 00:45:00.000 --> 00:45:13.000 we're going to hear from Robert Jansma from the University of Siegen, and he's going to be talking about how his project has been looking at mapping and tracking the development of online commenting systems on 00:45:13.000 --> 00:45:17.000 news websites between 1996 and 2021. 00:45:17.000 --> 00:45:28.000 So, Robert, the floor is yours. Thanks, Sam. Yeah, let me just share some slides with you. 00:45:28.000 --> 00:45:35.000 There we go. Okay, you all see this fine? We can. Yes, all right. 00:45:35.000 --> 00:45:49.000 So yeah, our project is looking at the development of online commenting systems, specifically on news websites, together with a large team from across the Netherlands and Germany. 00:45:49.000 --> 00:45:58.000 We're part of a larger collaborative research center that looks at the transformations of the popular, and our bit of the popular are
00:45:58.000 --> 00:46:16.000 Those online commenting systems, that's what we're looking at, and kind of the emphasis of our project is really online comments getting into discussion around 2015, 2016, where a lot of online media outlets start to 00:46:16.000 --> 00:46:33.000 remove their commenting systems. So there was this disruption that happened around then, and that is really what we're trying to focus on, these disruptions. Online commenting really got popularized in the early 00:46:33.000 --> 00:46:41.000 blogosphere, where it was seen as a novel mode of participation and of democratizing the online public. 00:46:41.000 --> 00:46:45.000 So it wasn't only a news article or a blog post itself that had any value, 00:46:45.000 --> 00:46:54.000 but also the users and interactors with that text, who could react to it. With the rise of Web 2.0 00:46:54.000 --> 00:47:02.000 and social media, we've seen that commenting is increasingly becoming problematized. 00:47:02.000 --> 00:47:09.000 So it's no longer just unwanted comments that are spam, or even trolls, 00:47:09.000 --> 00:47:15.000 but comments are actually seen as toxic and as taking away something from the main text. 00:47:15.000 --> 00:47:29.000 With this problematization we see a rise in a commercial moderation industry, where the providers of commenting technology also offer ways to stem 00:47:29.000 --> 00:47:37.000 the tide of these unwanted comments, allowing unwanted comments to be filtered out and wanted ones to be highlighted. 00:47:37.000 --> 00:47:43.000 So these disruptions are what we want to follow in our web archives. 00:47:43.000 --> 00:47:50.000 So we're following these disruptions, and where we're looking is the comments themselves. 00:47:50.000 --> 00:47:56.000 So we're trying to find comments and comment sections in our web archive data. 00:47:56.000 --> 00:48:03.000 Traditionally, this is both literally and figuratively the bottom half of the Internet.
00:48:03.000 --> 00:48:13.000 Usually below an article. At the left here you can see the Disqus commenting system, which is traditionally at the bottom of an article, 00:48:13.000 --> 00:48:20.000 sometimes also next to an article, but there are also cases where the comments are only explicitly loaded 00:48:20.000 --> 00:48:24.000 when clicked on or explicitly searched for, or 00:48:24.000 --> 00:48:32.000 they are on a completely separate page, and this really is one of those disruptions that we're trying to track. 00:48:32.000 --> 00:48:45.000 In general, we see that comments across the web are not very well preserved, and that's mostly due to the way that they're implemented using JavaScript. 00:48:45.000 --> 00:48:53.000 They're not part of the main text itself, the HTML, but they're loaded in through separately fired 00:48:53.000 --> 00:49:08.000 JavaScript calls, and depending on how websites are crawled, these comments can be saved. But when you're trying to play back the web page in the Internet Archive Wayback Machine, or even in SolrWayback, 00:49:08.000 --> 00:49:15.000 it is this JavaScript that is overriding comments that are actually preserved. 00:49:15.000 --> 00:49:20.000 The image here shows the Disqus commenting system trying to load in new comments. 00:49:20.000 --> 00:49:37.000 So even if the archive holds the comments themselves, and we can see that in the data, when we try to visually inspect it you can't see it, because the JavaScript is overlaying it. So in 00:49:37.000 --> 00:49:41.000 our archives we are looking for indicators of comments. 00:49:41.000 --> 00:49:55.000 This can be just the comments themselves, or JavaScript code that tries to load them in, but also less direct indicators like comment counters or a show comments button. 00:49:55.000 --> 00:50:14.000 These can be used to find known commenting systems like Disqus or others.
And when we can't positively identify a commenting system that we know of, we want to use indirect indicators suggesting the presence of one, like 00:50:14.000 --> 00:50:21.000 show comments or reply. But these can have language-specific interpretations. Also, 00:50:21.000 --> 00:50:36.000 websites use different terms: some use reply, some react. And we want to separate these out from the main text, so that's an additional challenge when trying to locate these commenting systems. 00:50:36.000 --> 00:50:45.000 Now we have 3 indicators for Disqus. Here I'm showing them for our world news data set, which is a list of the top 15 00:50:45.000 --> 00:50:56.000 most popular world news outlets, which we have just a tiny fraction of, to get kind of an interpretation of what commenting systems are used. 00:50:56.000 --> 00:51:02.000 And we see the indicators of Disqus we use there: partly the embedded code, 00:51:02.000 --> 00:51:10.000 but Disqus also has unique identifiers for each of its commenting threads, and also the commenting 00:51:10.000 --> 00:51:25.000 threads themselves. We look for the HTML code and find that in our data set, and on some pages there is overlap between these indicators, and we can see Disqus actually becoming popular 00:51:25.000 --> 00:51:42.000 and slowly losing its popularity in our data set, with its peak around our expected 2015, and afterwards fewer and fewer of our most popular news websites use Disqus as a commenting system. 00:51:42.000 --> 00:51:55.000 So we want to use this disruption, this year of 2015, where apparently news websites switch from Disqus to other commenting systems or to no commenting system at all. 00:51:55.000 --> 00:52:10.000 So what happened there? We want to use this archive data to see how the distribution of commenting technologies evolves, and contextualize it with use cases.
00:52:10.000 --> 00:52:22.000 So we can take our web archive data to developers of these systems, but also to publishers who use these systems on their websites, and ask: 00:52:22.000 --> 00:52:29.000 here you changed, so why did you change? And we are expecting them to tell us: 00:52:29.000 --> 00:52:48.000 well, this new commenting system allowed us to have different affordances or shape different practices. And we want to use that information to go back to our data and see if we can find that; if we can, we iterate on this 00:52:48.000 --> 00:52:56.000 process to find how these distributions of commenting affordances and practices shape each other. 00:52:56.000 --> 00:53:05.000 So yeah, if you want to do some reading, here in some extra slides are some sources that you might want to see. 00:53:05.000 --> 00:53:23.000 I want to open it up for questions. Thanks, Robert. Sorry, looks like we don't have any questions at the moment, so maybe in the interest of time we'll ask Sam to come back in and introduce the next speaker, and if we have additional questions 00:53:23.000 --> 00:53:29.000 for you, we'll take them at the end. Sounds great. Over to you, Sam. 00:53:29.000 --> 00:53:37.000 Perfect, thanks so much. So we're gonna switch back now for our last 2 cohort presentations 00:53:37.000 --> 00:53:42.000 to focus back in on the IIPC COVID collection. 00:53:42.000 --> 00:53:53.000 And so first off we'll hear from Valérie Schafer from the University of Luxembourg, and she's gonna be talking to us about her project entitled Analysing Web Archives of the 00:53:53.000 --> 00:53:57.000 COVID Crisis through the IIPC Novel Coronavirus Dataset. 00:53:57.000 --> 00:54:15.000 So, Valérie, over to you. Yeah, hi everybody. Yeah, indeed. 00:54:15.000 --> 00:54:21.000 I will talk about the AWAC2 team and this very long title, which is related to the COVID crisis. 00:54:21.000 --> 00:54:45.000 And first of all, I want to remind you,
but you all know that, that the COVID crisis was a shared, mediated, and in some ways a unique event, also leading to a lot of experiences 00:54:45.000 --> 00:54:56.000 around living archives, and there is a lot of heterogeneity within the web archiving that happened during this time. 00:54:56.000 --> 00:55:06.000 I give you a few examples, like the archiving that the Library of Congress conducted, but also, for example, the BnL in Luxembourg. 00:55:06.000 --> 00:55:14.000 But I could also talk about a lot of other European libraries that conducted web archiving. 00:55:14.000 --> 00:55:33.000 And there were also projects that were collective projects with a very bottom-up approach, like the COVID Tracking Project, and also more precise collections related, for example, to the National Library of Medicine, with a much more precise scope. And 00:55:33.000 --> 00:55:44.000 of course I could give you plenty of examples, and within, of course, all these web archiving experiences and projects, 00:55:44.000 --> 00:55:53.000 there was this crucial role of libraries, of web archiving institutions, and here there is an international player. 00:55:53.000 --> 00:56:01.000 This is the International Internet Preservation Consortium, which also launched a collection, 00:56:01.000 --> 00:56:24.000 thanks to the contribution of more than 30 members, or around 30 of their members, plus public nominations from individuals and institutions, and they reacted, of course, very fast. And during March 2020, so during the beginning of the crisis, 00:56:24.000 --> 00:56:41.000 we were launching a European project under the leadership of Niels Brügger, Aarhus University, with also Jane Winters as a co-PI, which is entitled WARCnet, and within this project we had a working group which was planned 00:56:41.000 --> 00:56:45.000 to study unforeseen events and the way they were archived. 00:56:45.000 --> 00:57:03.000 So we had thought about plenty of examples.
But at the beginning of the project the COVID crisis arose, and we saw that it was a very good opportunity to also observe in real time how web archiving was conducted, how data were 00:57:03.000 --> 00:57:10.000 preserved, and so on. So within this project, starting in March 2020, 00:57:10.000 --> 00:57:19.000 we had several research outputs, like conducting oral histories with web archivists, having datathons, and so on. 00:57:19.000 --> 00:57:30.000 And we had the feeling that, of course, the IIPC collection was a unique data set, especially because of this international frame that they had from the beginning. 00:57:30.000 --> 00:57:34.000 This was confirmed in an interview by Friedel 00:57:34.000 --> 00:57:53.000 Geeraert, who interviewed Nicola Bingham on the COVID collection, and we also had several comparisons between, of course, national web archiving and the IIPC collection. To quote Friedel again, she tried, for example, 00:57:53.000 --> 00:58:05.000 to map the overlaps between national collections, as we had access to metadata, seed lists, and so on from several European institutions, and what was in the IIPC collection. 00:58:05.000 --> 00:58:19.000 And we were a bit, yeah, surprised that there were not so many overlaps, because, of course, national institutions had to select some URLs to give to the IIPC. 00:58:19.000 --> 00:58:36.000 So it's not, of course, exactly the same collections. But here came the call for the cohort, by Sam, by Ian, by Nick, and others. And we said, okay, we want to go further into this IIPC collection. 00:58:36.000 --> 00:58:53.000 This is a wonderful opportunity, for several reasons: because of, as you see, hashtag mentorship, because of access to plenty of data related to this IIPC collection, because of a unique collaboration also at the international level with 00:58:53.000 --> 00:59:00.000 archivists, with the IIPC, and so on. And of course you may imagine that for us such a collection, I will come back to
00:59:00.000 --> 00:59:15.000 it, is like big data for SSH. So here are the members of the cohort, and at some point we felt that we also had to welcome another member, who is Joshgun Sirajzade, 00:59:15.000 --> 00:59:23.000 and he is the only computer scientist in our team. And why did we feel the need to have a computer scientist? 00:59:23.000 --> 00:59:27.000 Because it's a huge dataset and we were expecting a lot. 00:59:27.000 --> 00:59:37.000 Of course we had an idea, but, for example, having to deal with 8 million lines related to the extracted links of web pages, 00:59:37.000 --> 00:59:44.000 this is something which is very challenging for SSH researchers, 00:59:44.000 --> 00:59:58.000 even if some of us have computer skills. So the first part of our work was really to map this collection, to have a clearer insight into the data. 00:59:58.000 --> 01:00:07.000 And here you can see, for example, the number of crawls by date, and here we also have research related to multilingualism. 01:00:07.000 --> 01:00:14.000 I will come back to it later, but it's also very challenging when you have a mixture of plenty of languages. 01:00:14.000 --> 01:00:28.000 We have here also a first test on the distribution through time of several topics like, for example, yeah, the topic of the number of deceased people, 01:00:28.000 --> 01:00:34.000 unfortunately, children and Covid, or cooking during Covid time, and so on. 01:00:34.000 --> 01:00:42.000 And then we came back to the IIPC community. We also wanted to give something back to them. 01:00:42.000 --> 01:00:57.000 They had allowed us to explore that dataset, and we wanted, of course, to focus on a more precise topic, and we asked them to vote for their favourite topic among 6 choices related to homeschooling, cultural 01:00:57.000 --> 01:01:04.000 heritage, and so on, and the result was that they were interested in the topic Women, Gender and Covid.
01:01:04.000 --> 01:01:23.000 And of course we were also very interested in it, because Covid was, of course, a shared experience, but also, unfortunately, we can see that it created some asymmetries, or made some asymmetries more visible. So 01:01:23.000 --> 01:01:26.000 there are plenty of topics that may be explored related to women, gender and Covid: 01:01:26.000 --> 01:01:35.000 the case, for example, of pregnant women and vaccines, body issues like hair coloring, but also domestic violence, 01:01:35.000 --> 01:01:40.000 feminism and Covid. Some women wrote on Twitter: We will never. 01:01:40.000 --> 01:01:50.000 We are drafted, the lockdown, the new mental workload, homeschooling, care, the case of nurses, and so on. 01:01:50.000 --> 01:01:56.000 Of course it's very difficult to find women in these archives, for several reasons: 01:01:56.000 --> 01:02:07.000 multilingualism first, but also a lot of noise and polysemic words that may appear in our search. However, we are optimistic. 01:02:07.000 --> 01:02:15.000 We made some tests related to nurses, but in French, because in French it's not the same word for a female or male nurse. 01:02:15.000 --> 01:02:31.000 So a first area was, yeah, a first test or mapping, also on the French corpus, of domestic violence and how it was also expressed, or mapped, mirrored in web archives. 01:02:31.000 --> 01:02:37.000 The next step is to do it also on the English corpus, and to do it collectively. 01:02:37.000 --> 01:02:42.000 As you may have seen, the team is also in several European countries. 01:02:42.000 --> 01:03:02.000 So we worked remotely, but on the 24th and 25th of March we will have 2 days at the University of Luxembourg to work all together on this corpus, and to give you probably new results back in a few weeks, 01:03:02.000 --> 01:03:10.000 too. Thank you for your attention. Thank you, Valerie. We don't have any immediate questions.
01:03:10.000 --> 01:03:15.000 So maybe in the interest of time, if you do have questions for Valerie, drop them here in the Q and A. 01:03:15.000 --> 01:03:21.000 We'll pick them up at the end, but I'd like to bring Sam back to introduce our next speaker. 01:03:21.000 --> 01:03:35.000 Thank you so much, Valerie. Perfect. And so now we come full circle with our cohort research projects, and it's a pleasure to introduce Sean Walker from Arizona State University, who's going to be talking about 01:03:35.000 --> 01:03:43.000 viral health misinformation from GeoCities to Covid-19, and, like some of the other projects before, Sean's team has blended 01:03:43.000 --> 01:03:46.000 a number of different collections that they've looked at for their project. 01:03:46.000 --> 01:03:54.000 So Sean, over to you. Thank you so much, Sam. Let me share my slides real quick 01:03:54.000 --> 01:04:03.000 here. So like Sam said, we are looking at health misinformation in general. 01:04:03.000 --> 01:04:13.000 We're looking at it in a historical context, which would be GeoCities, as well as a kind of current context, which would be Covid-19. 01:04:13.000 --> 01:04:15.000 I just want to say I'm representing a number of researchers. I'm 01:04:15.000 --> 01:04:19.000 one part of many. We have faculty that are involved: myself, 01:04:19.000 --> 01:04:27.000 my background's in information science; Michael Simeon, his background's in English as well as digital humanities and text analysis; 01:04:27.000 --> 01:04:32.000 Kurzweilchi, her background's in journalism; and we have a number of students working with us. 01:04:32.000 --> 01:04:38.000 Anna is in public health as well as looking at the future of innovation in society, 01:04:38.000 --> 01:04:41.000 so she does a lot of work around conspiracy theories. 01:04:41.000 --> 01:04:47.000 Major is a computer science student, he's an undergraduate, and Georgie is a PhD 01:04:47.000 --> 01:05:02.000
So as you can see we're bringing a number of fields to bear, to understand misinformation, and to understand and contextualize these different time frames our goals here of the project are to understand the narratives 01:05:02.000 --> 01:05:06.000 and linking patterns of Hiv misinformation in Geo. 01:05:06.000 --> 01:05:17.000 Cities to and to do something similar in kovat 19 surrounding these covid 19 dashboards, because these covid 19 dashboards. are they this way that we've communicated the status of covid 01:05:17.000 --> 01:05:24.000 over time throughout the Pandemic? and we're also looking at these covid 19 dashboards, and what's their kidness to Twitter? 01:05:24.000 --> 01:05:28.000 So we look at Geo cities as a way to communicate. 01:05:28.000 --> 01:05:32.000 In the early late early ninetys, mid late late 90 S. 01:05:32.000 --> 01:05:38.000 And then we look at Twitter, and then we're also interested In what information can we actually extract from these dashboards? 01:05:38.000 --> 01:05:47.000 Because many of these are broken, as you'll see in a moment So the methods that we're using we're using qualitative narrative analysis, we're doing a lot of qualitative work to understand so 01:05:47.000 --> 01:05:52.000 we're reading content, trying to understand the themes and I'm going to talk about some of those results today. 01:05:52.000 --> 01:05:56.000 We're using text analysis, so in addition to like text frequency topic analysis. 01:05:56.000 --> 01:06:04.000 So how do we understand this really large corpus of information We're talking tens of thousands of geo city pages that mention Hiv. 01:06:04.000 --> 01:06:10.000 How do we find the relevant pages that mention Hiv. which was surprisingly more difficult than we thought? 01:06:10.000 --> 01:06:16.000 I mean data cleaning is always difficult. but we're pretty shocked at how difficult it was to figure out what's related to Hiv. 01:06:16.000 --> 01:06:25.000 What wasn't related. 
to HIV. And then we're also using social network analysis as a way to look at some of these linking patterns. And we're using a lot of computational resources. 01:06:25.000 --> 01:06:33.000 So we're using the ARCH tools from the Archives Unleashed folks, which are very powerful, and we also use local computing resources that the university provides via our research 01:06:33.000 --> 01:06:39.000 computing group. Now, why would we want to compare HIV and Covid? 01:06:39.000 --> 01:06:46.000 I think one important concept is that both of these were novel viruses at the time, and we didn't know a lot about these viruses. 01:06:46.000 --> 01:06:52.000 So information was emergent, information was changing, and there was an attempt to fill those gaps even though the science was ongoing, right? 01:06:52.000 --> 01:06:56.000 We do have different politics that are at play in HIV versus in Covid. 01:06:56.000 --> 01:07:00.000 But we do see that initially we didn't really understand a lot about HIV. 01:07:00.000 --> 01:07:03.000 We didn't understand how it was transmitted, we didn't understand what communities were affected, 01:07:03.000 --> 01:07:08.000 we didn't understand treatment, and you can see a similar analog with respect to Covid-19. 01:07:08.000 --> 01:07:11.000 It was a similar novel virus that we needed to understand. 01:07:11.000 --> 01:07:21.000 So as people fill in gaps, misinformation and disinformation become tools to fill in some of those gaps, as well as tools to cause trouble, so to speak. 01:07:21.000 --> 01:07:25.000 So that's why we're comparing these 2, and these 2 timelines are helpful. 01:07:25.000 --> 01:07:32.000 So how has the information environment changed? As the information environment has changed, we can see posting on GeoCities 01:07:32.000 --> 01:07:36.000 is quite different than Covid-19 dashboards and Twitter. 01:07:36.000 --> 01:07:40.000 Then, if we look at GeoCities, we see GeoCities as an early community, right?
01:07:40.000 --> 01:07:45.000 This move from Usenet to GeoCities, where people could self-publish. 01:07:45.000 --> 01:07:49.000 I mean, it was very difficult to self-publish by comparison, versus these Covid-19 01:07:49.000 --> 01:07:53.000 dashboards, which are published by many official sources. So, at least in the United States, 01:07:53.000 --> 01:07:57.000 every single one of the states has a Covid dashboard, as well as national dashboards. 01:07:57.000 --> 01:08:03.000 We have international dashboards. So these dashboards were kind of the way that we put our finger on what Covid was doing. 01:08:03.000 --> 01:08:13.000 But we can see GeoCities, plus Twitter and Covid dashboards, kind of give this idea of how folks were talking about this. 01:08:13.000 --> 01:08:18.000 So what resources are we using to do this? The first we're using is the GeoCities 01:08:18.000 --> 01:08:32.000 archive from the Internet Archive, and so we're using this to search for mentions of HIV, AIDS, and other keywords to understand what was circulating, how HIV and AIDS were projected, and you can 01:08:32.000 --> 01:08:38.000 see here's an example where aloe is considered a treatment, potentially, for HIV. 01:08:38.000 --> 01:08:43.000 We're also using the IIPC Covid-19 archive, which Valerie already introduced, 01:08:43.000 --> 01:08:54.000 so I won't go deep into that, but we're using this to look at these Covid-19 dashboards, and we can see this is an example of the Arizona Covid-19 dashboard right here, and you can 01:08:54.000 --> 01:09:00.000 see this is loading it through the Internet Archive, and you can see that this gives us an error page, right? 01:09:00.000 --> 01:09:03.000 And this is fairly common, because many of these dashboards are quite complex. 01:09:03.000 --> 01:09:12.000 They're developed in Tableau or other dashboard services that are resistant (which is probably an understatement) to archiving, so we have pieces of them.
01:09:12.000 --> 01:09:20.000 We have JavaScript that's embedded in the pages that sometimes contains data, so just because the pages won't load, that doesn't mean there isn't any sort of data behind them. So 01:09:20.000 --> 01:09:23.000 we're interested in how we extract that data. So you see, in this example, 01:09:23.000 --> 01:09:29.000 this is an error page that you receive when you look at the dashboards. We see this is fairly common. 01:09:29.000 --> 01:09:39.000 Other dashboards might load with different temporal data, so it says, you know, December the eleventh, but it actually loads data from 2 weeks before and 2 weeks after. 01:09:39.000 --> 01:09:50.000 So it's this kind of mishmash of a dashboard that actually doesn't represent that one day. You can see this is what the live dashboard looks like on the Arizona Department of Health website. So you can see 01:09:50.000 --> 01:09:57.000 that error message. What's supposed to be there is actually a map of the counties in the State of Arizona, as well as the number of cases and such. 01:09:57.000 --> 01:10:00.000 So you can see the differences between those. So briefly, 01:10:00.000 --> 01:10:09.000 I just want to give you some results that we found, and this is going to mainly come from our narrative analysis, mostly of Covid, sorry, mostly of GeoCities 01:10:09.000 --> 01:10:14.000 and HIV. But I'm going to talk about how that contrasts a little bit with Covid-19. 01:10:14.000 --> 01:10:23.000 So we found really these 4 categories of misinformation in these GeoCities 01:10:23.000 --> 01:10:35.000 pages about HIV. So we find this idea of supportive information for treating HIV symptoms and infections, and this is incorrect information, and we also have to contextualize this at the 01:10:35.000 --> 01:10:41.000 time. So what did we know in the late nineties to early 2000s versus what do we know now?
01:10:41.000 --> 01:10:52.000 So what might have been accurate information at that moment in time, in the late nineties or early 2000s, might actually not be accurate information now, but it wasn't misinformation back then, so we're trying to 01:10:52.000 --> 01:10:57.000 contextualize that. We also see preventative information, offering methods of HIV prevention. 01:10:57.000 --> 01:11:00.000 We see debunking: people are working to debunk and say, this is misinformation, 01:11:00.000 --> 01:11:09.000 these are misconceptions, this is incorrect. And then we see this other category that's basically homophobic and hate-filled content. 01:11:09.000 --> 01:11:21.000 So these 4 categories actually kind of hold, and we see these supportive, debunking and preventative categories as really these attempts to fill information voids. We're missing scientific information, so someone has 01:11:21.000 --> 01:11:26.000 to fill it. We also don't see the government conspiracy theories that we see now. 01:11:26.000 --> 01:11:30.000 Right now we see this heavily politicized in our Covid data, right? 01:11:30.000 --> 01:11:33.000 We see this heavily politicized, like anti-government, like lack of trust in government. 01:11:33.000 --> 01:11:39.000 Well, we don't see that in the GeoCities data, but of course all these archives are incomplete, 01:11:39.000 --> 01:11:49.000 so we might be missing some of that information. But we see this kind of contrast, that Covid takes those 4 categories and then adds this conspiracy theory category, and the hate, of course, in Covid 01:11:49.000 --> 01:11:52.000 is not homophobic. The hate is like vaccinations, 01:11:52.000 --> 01:11:59.000 other kinds of things. And then, finally, we really see all this misinformation not as attempts to harm people, 01:11:59.000 --> 01:12:02.000 but we see the vast majority of this as this idea of community sense making.
01:12:02.000 --> 01:12:07.000 So how do we make sense of the current pandemic, whether it's HIV, 01:12:07.000 --> 01:12:13.000 whether it's Covid? How do we provide support information? And so these are our preliminary results 01:12:13.000 --> 01:12:23.000 at this moment in time. Moving forward, we're looking at those dashboards, we're looking at linking patterns and other things. And thank you. If you have questions, feel free to email me or reach out to any of the folks in 01:12:23.000 --> 01:12:28.000 the group. And thank you very much. Thank you so much, Sean. 01:12:28.000 --> 01:12:33.000 We do have a question that's come in, so it's about timing. 01:12:33.000 --> 01:12:40.000 So the rise of HIV emerged pre-internet, and so I'm wondering if there's anything that you found in your research. 01:12:40.000 --> 01:12:52.000 Is there a difference in the news media and the way that people were communicating at the start of HIV versus at the start of Covid that might have led to some of those differences? Like, I'm thinking with the lack of conspiracy 01:12:52.000 --> 01:13:03.000 theories, did those only happen in print? Well, that's an interesting question, because that's one of the things we've been thinking about, because some of our colleagues in the project are journalists, and so we can 01:13:03.000 --> 01:13:10.000 see that, you know, GeoCities' time was whenever newspapers were just starting to have websites, so we actually don't see a lot of news linking. 01:13:10.000 --> 01:13:14.000 We do see copying and pasting of news stories, like someone actually typed it up and put it in. 01:13:14.000 --> 01:13:18.000 We see a little bit of that, but the news doesn't play a central role. 01:13:18.000 --> 01:13:20.000 Also, there's a lack of news coverage around HIV. 01:13:20.000 --> 01:13:26.000 It was really ignored at that time, and we see that as a huge difference. With Covid,
01:13:26.000 --> 01:13:32.000 we see a lot of news linking, but then we also see illegitimate news sites that, you know, mainly traffic in misinformation. 01:13:32.000 --> 01:13:39.000 So we see news used as a different sort of weapon within Covid, in a way that it's not used in HIV. Interesting. 01:13:39.000 --> 01:13:44.000 Thank you for that. I'm looking here, and I see that we have additional questions. 01:13:44.000 --> 01:13:52.000 So let's go into our open Q and A. I have a couple for individuals, and then I have some for the group as a whole. 01:13:52.000 --> 01:13:57.000 And also, folks in the room, if you have additional questions, please do drop them into the Q and A, 01:13:57.000 --> 01:14:01.000 and we'll field them for the next maybe 5 min or so. 01:14:01.000 --> 01:14:06.000 And I see one here for Robert, asking: in your analysis of commenting platforms used, 01:14:06.000 --> 01:14:18.000 did you all find that organizations were just actively pivoting away from Disqus, maybe towards platforms with better administrative features, or getting rid of comment sections completely? 01:14:18.000 --> 01:14:25.000 Robert, anything you can field there? Yeah, so we see a bit of both. 01:14:25.000 --> 01:14:32.000 It's hard to say something about general trends without diving deep into the data itself. 01:14:32.000 --> 01:14:42.000 But there are definitely individual sites that switch away from Disqus towards something like Facebook comments. 01:14:42.000 --> 01:14:51.000 And the advantage, from a platform perspective, is this: Disqus offers a degree of anonymity to the user, 01:14:51.000 --> 01:14:58.000 and anonymous users on the Internet, of course, can be very unruly. 01:14:58.000 --> 01:15:13.000 On Facebook, you are expected to use your real name, so there is a degree of accountability there. And other outlets, like CNN.com, currently don't offer any commenting functionality anymore.
01:15:13.000 --> 01:15:20.000 But we also see things like The Guardian. I don't think they use Disqus. 01:15:20.000 --> 01:15:23.000 They use their own proprietary piece of technology, 01:15:23.000 --> 01:15:31.000 but they only have their comments open for a limited time, and only on opinion pieces. 01:15:31.000 --> 01:15:38.000 Yeah, I have noticed the comment section just sort of leaving news sites almost entirely. 01:15:38.000 --> 01:15:44.000 It feels like social media has become the replacement for comments. 01:15:44.000 --> 01:15:54.000 I have a question here for everyone, and any of the panelists from the cohort program who would be interested in addressing this. 01:15:54.000 --> 01:15:58.000 The question is, has the cohort program advanced your research? 01:15:58.000 --> 01:16:10.000 And if so, how? Anyone want to take a stab at that? Valerie? 01:16:10.000 --> 01:16:33.000 Go for it. Yeah, I would say, of course it has advanced our research, and notably the fact that we have a regular meeting every 2 weeks with the Archives Unleashed team is very helpful, as we can also ask questions and 01:16:33.000 --> 01:16:42.000 find solutions to technical issues, or also brainstorm together, because they have plenty of answers, but not necessarily all the answers. 01:16:42.000 --> 01:16:52.000 And so there is also real feedback, which is enriching also for the way they may develop the interface and address a research question. 01:16:52.000 --> 01:17:01.000 And from all sides it's very useful, and the interface is also so user-friendly that it's really, 01:17:01.000 --> 01:17:07.000 yeah, I would say, the next step for researchers looking into web archives. 01:17:07.000 --> 01:17:14.000 So yeah, of course it's very useful, and if I could apply for the next cohort program, 01:17:14.000 --> 01:17:20.000 I would definitely, but I think I should give the opportunity also to other researchers.
01:17:20.000 --> 01:17:31.000 So that's great. Anyone else like to field that? Go, Sean. 01:17:31.000 --> 01:17:39.000 Yes, I mean, I have a fondness for the Archives Unleashed program, because I started working with them when I was a PhD 01:17:39.000 --> 01:17:43.000 student many moons ago, and that's how I learned about web archives and found this community. 01:17:43.000 --> 01:17:48.000 But the really amazing part about the cohort program is that there are resources provided. 01:17:48.000 --> 01:17:59.000 So there's technical expertise, some of the team members have written some of the seminal papers on looking at GeoCities web archives, and they're interdisciplinary, so they understand how to have conversations to help. 01:17:59.000 --> 01:18:01.000 Members of my team have not worked with web archives before; I'm 01:18:01.000 --> 01:18:10.000 the only one who's ever done that. And they've stepped us through this process and held our hands and provided advice, methodologically and technically, and worked closely with us 01:18:10.000 --> 01:18:16.000 whenever we have difficulties working with data and things don't go as well. 01:18:16.000 --> 01:18:21.000 It's just a really supportive environment to be able to actually get a lot of work done, 01:18:21.000 --> 01:18:24.000 and work with really some of the top folks in web archives in the world. 01:18:24.000 --> 01:18:31.000 Here's another question, then, from Abbey in the Q 01:18:31.000 --> 01:18:40.000 and A, for everyone: were the others of you already working with web archives to some extent before working on these projects with Archives Unleashed? 01:18:40.000 --> 01:18:44.000 Or were they pretty new to you as researchers? And what led you to web archives? 01:18:44.000 --> 01:18:49.000 I think people would love to hear from all of you on this topic. 01:18:49.000 --> 01:18:56.000 Who wants to go first? Valerie, do you want to,
01:18:56.000 --> 01:19:03.000 you want to take a shot at that? Tim was raising his hand, so perhaps Tim, and then I will, 01:19:03.000 --> 01:19:10.000 I will follow, if you agree. Sounds great. Go for it, Tim. Unmute. There we go. 01:19:10.000 --> 01:19:19.000 I think one of the great things about this program for us is that we had a subscription to Archive-It. 01:19:19.000 --> 01:19:24.000 We've been saving content into it, but we didn't really have any researchers jump in to use any of it. 01:19:24.000 --> 01:19:31.000 So this was a perfect opportunity for us to produce that dataset, and then also make use of it in a research initiative. 01:19:31.000 --> 01:19:35.000 So I think it's a very nice narrative, those couple of pieces put together. 01:19:35.000 --> 01:19:51.000 That's great. Thanks, Tim. How about you, Valerie? So I was not new in the field of web archives, but I discovered new challenges, and the cohort program is a very good opportunity to work collectively, which is very important, it's 01:19:51.000 --> 01:19:55.000 the first thing, and to work deeper into distant reading. 01:19:55.000 --> 01:20:08.000 I was more someone who was into close reading of web archives, or a mixture, and now I'm also, yeah, experimenting with a new way of researching, with notebooks, with new software, 01:20:08.000 --> 01:20:17.000 and so on. And of course here, too, we have access to data thanks to all these partnerships, to the IIPC, Archive-It, the team. 01:20:17.000 --> 01:20:27.000 It's also very valuable, of course. Would anyone else like to throw in on this question? 01:20:27.000 --> 01:20:35.000 Yeah. So before the Archives Unleashed project, the cohort program, we were just manually scraping, 01:20:35.000 --> 01:20:39.000 basically, the Internet Archive Wayback Machine, manually clicking through it. 01:20:39.000 --> 01:20:45.000 So it really helped us with access to the archives, and to really think about, all right, 01:20:45.000 --> 01:20:56.000 what do we want in our dataset?
And the other thing: it's very helpful to have a team there to think with you that knows what's technically possible, 01:20:56.000 --> 01:21:00.000 but also what's wanted, because there's always this give and take about, all right, 01:21:00.000 --> 01:21:04.000 we want all the data. Well, you can't physically store that on your computer. 01:21:04.000 --> 01:21:11.000 So you need to make a subset that is intelligent and manageable while still being representative. 01:21:11.000 --> 01:21:30.000 And I think that really helps. That's great. Shina? Yeah, I think that what was interesting for us was we were coming from a place where we were doing this sort of low-stakes personal archiving. 01:21:30.000 --> 01:21:36.000 We were calling that a small data collection, which had this really kind of intimate relationship with what we were trying to capture on the web, 01:21:36.000 --> 01:21:42.000 and it was largely social media based. And so this was a really great kind of shift for us into thinking, 01:21:42.000 --> 01:21:48.000 what can big data do? Which is something that we're very skeptical of, as people who subscribe to and think through kind of data 01:21:48.000 --> 01:22:02.000 feminism as a principle, and it's given us this whole other set to work with, and again, in that really supportive mentoring space, to just start to play in these mixed-method ways between small and big data, 01:22:02.000 --> 01:22:06.000 and to see, well, what are the kinds of questions that come out when you do make that interaction between the 2. 01:22:06.000 --> 01:22:14.000 And so that's been a really kind of exciting advancement in the research. That's great. Looking at our time, 01:22:14.000 --> 01:22:25.000 we're winding down here today. But I have one final question, and this one is for Sam, and the question is, will the cohort one folks be a resource for the second cohort? 01:22:25.000 --> 01:22:41.000 I would say absolutely so.
I mean, as the panelists have shown, there's a really wide range in terms of not only topics but, you know, the experiences and expertise that they're bringing to each of their projects, and I think carrying 01:22:41.000 --> 01:22:49.000 that forward to the next program is going to be really important as a starting place for these new cohort groups, 01:22:49.000 --> 01:22:55.000 and just, I guess, getting them thinking about, you know, some of the available opportunities that they already have, 01:22:55.000 --> 01:23:03.000 and, you know, having that inspiration from this first round of groups, which has been absolutely fantastic. 01:23:03.000 --> 01:23:14.000 Well, thank you so much, Sam, for that, and a big thank you to all the Archives Unleashed panelists, and to Quinn at the start. 01:23:14.000 --> 01:23:22.000 So what I want to do here is kind of wind us down and tell everybody about what's coming next for our series. 01:23:22.000 --> 01:23:26.000 So this series, our Library as Laboratory series, was guided by an excellent team of advisors, 01:23:26.000 --> 01:23:32.000 and they all come with a big background in digital humanities scholarship. 01:23:32.000 --> 01:23:37.000 So we have Dan Cohen from Northeastern University; Makiba Foster, 01:23:37.000 --> 01:23:40.000 who I had the privilege of working with at Washington University, 01:23:40.000 --> 01:23:47.000 she's now at Broward County Library; Mike Furlough from HathiTrust; and Harriet Green from Washington University in St. 01:23:47.000 --> 01:24:03.000 Louis, who is a digital scholarship librarian. And so thanks to all of those community members who helped guide the series and our sessions.
So if you have a research project, and you're interested in telling a little bit more 01:24:03.000 --> 01:24:14.000 about it, please, we have an opportunity for you. We're doing a lightning talk session in our final session of the series; that'll be on May the eleventh. It'll be a quick series of 01:24:14.000 --> 01:24:25.000 talks, 2 and a half to 3 min long at the most, and we do have submission guidelines that I think Duncan may be sharing with you in the chat. 01:24:25.000 --> 01:24:30.000 We already have 4 excellent projects that have submitted for that lightning 01:24:30.000 --> 01:24:33.000 talk round on May the eleventh, and we're looking to add a couple more. 01:24:33.000 --> 01:24:38.000 So if you have a project that uses content from the Internet Archive that you'd like to show off, 01:24:38.000 --> 01:24:51.000 we'd love to give you an opportunity to talk about that in that session on May the eleventh. So up next in our series, in 2 weeks, on March the thirtieth, we'll be bringing you Hundreds of Books, Thousands of 01:24:51.000 --> 01:24:55.000 Stories: A Guide to the Internet Archive's African Folktales. 01:24:55.000 --> 01:24:58.000 And so on the next slide you'll see we do have a 01:24:58.000 --> 01:25:02.000 QR code. If you scan that QR code with your phone right now, 01:25:02.000 --> 01:25:04.000 that's going to take you directly to the registration for the next session. 01:25:04.000 --> 01:25:08.000 So go for it. I have a little bit of background here 01:25:08.000 --> 01:25:11.000 I want to tell you about for the next session. 01:25:11.000 --> 01:25:15.000 So what we want you to do is join us to hear from educator and bibliographer 01:25:15.000 --> 01:25:18.000 Laura Gibbs and researcher, writer and artist Helen 01:25:18.000 --> 01:25:26.000 Nde as they give attendees a guided tour of the African folktales at the Internet Archive and in our collection.
01:25:26.000 --> 01:25:38.000 So Laura is going to share the bibliographic work that she's done to write up self-published work, including a reader's guide to the African folktales at the Internet Archive, and 01:25:38.000 --> 01:25:43.000 she's gonna tell all of our attendees how they can do the same. 01:25:43.000 --> 01:25:57.000 Then Helen is gonna tell us about the communities and the individuals that often aren't represented online, and how she weaves their stories into the online world using Twitter and other technologies. 01:25:57.000 --> 01:26:06.000 And also, this is just really fascinating, how she's using technology to continue the African storytelling tradition in spoken form. 01:26:06.000 --> 01:26:10.000 So I know you'll all want to join in for that session in 2 weeks. 01:26:10.000 --> 01:26:17.000 It's gonna be really good. I've had the privilege of talking both with Laura and with Helen, and they are just gonna knock your socks off. 01:26:17.000 --> 01:26:21.000 They're really gonna give a great presentation, so I hope you all can join in for that. 01:26:21.000 --> 01:26:35.000 So as I wind down here today, a final reminder that we will be sending out an email to everyone with the link to the session recording and to all of the resources that we've shared in chat today. On behalf of the 01:26:35.000 --> 01:26:41.000 Internet Archive and our presenters at Archives Unleashed and Saving Ukrainian Cultural Heritage Online, 01:26:41.000 --> 01:26:45.000 I'd like to thank you for your time and your participation today. 01:26:45.000 --> 01:27:00.000 I hope to see you at our next session on March the thirtieth, or one of the later talks in our Library as Laboratory series.